Unit 3 topics (that we’ve covered thus far)

Week 10 - Sampling distributions and confidence intervals

Effect size

Confidence intervals provide an (interval) estimate for the effect of interest. Hence, confidence intervals, and not hypothesis tests, can inform us about the effect size. This makes it easier for us to compare the results of our statistical analysis to a practical understanding of what kind of “effect” would be important in a particular problem setting.

Practical significance is not determined by statistical significance! Statistical significance is not determined by practical significance!
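
To see how the two notions can come apart, here is a minimal sketch in R. All numbers are hypothetical (a sample mean of 0.5 units, a standard deviation of 10, and two made-up sample sizes), and `effect_summary` is an illustrative helper name, not something from the course materials.

```r
# Hypothetical numbers for illustration only (not from the course data).
xbar <- 0.5   # observed sample mean (the estimated effect)
s    <- 10    # sample standard deviation
mu0  <- 0     # null value, H0: mu = 0

# Illustrative helper: two-sided one-sample t-test p-value and
# confidence interval computed from summary statistics.
effect_summary <- function(n, conf = 0.95) {
  se     <- s / sqrt(n)
  t_stat <- (xbar - mu0) / se
  p_val  <- 2 * pt(-abs(t_stat), df = n - 1)
  t_star <- qt(1 - (1 - conf) / 2, df = n - 1)
  c(n = n, p_value = p_val,
    lower = xbar - t_star * se, upper = xbar + t_star * se)
}

small <- effect_summary(100)      # same estimate, small sample
large <- effect_summary(100000)   # same estimate, very large sample
print(rbind(small, large))
```

With the larger sample the same estimate becomes statistically significant, but the interval still describes an effect of roughly half a unit. Whether half a unit matters is a question about the problem setting, not about the p-value.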

Week 11 - Hypothesis tests for an unknown proportion or mean

Hypotheses are typically designed so that the claim we want to find evidence for is expressed in the alternative hypothesis. For all of the methods that we’ve covered thus far, the null hypothesis is always going to be of the form \[H_0: \text{<parameter> } = \text{ some number}\]

Types of conclusions

The only way to reduce both types of error is to collect more evidence or, in statistical terms, to collect more data.

  • \(\alpha = Pr(\text{Type I error})\): If \(H_0\) is true, this is the probability that we (incorrectly) reject it.

  • \(\beta = Pr(\text{Type II error})\): If \(H_0\) is false, this is the probability that we (incorrectly) fail to reject it.

  • \(1-\beta = \text{Power}\): If \(H_0\) is false, this is the probability that we (correctly) reject it.
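
The link between sample size and the error probabilities above can be sketched with `power.t.test` from base R's stats package. The setting is hypothetical (a one-sample t test with a true effect of 2 units and a standard deviation of 10); \(\alpha\) is held fixed at 0.05 while \(n\) grows.

```r
# Hypothetical setting: one-sample t test, true effect delta = 2, sd = 10.
sample_sizes <- c(25, 100, 400)

# Power (1 - beta) at each sample size, with alpha fixed at 0.05.
power_at_n <- sapply(sample_sizes, function(n) {
  power.t.test(n = n, delta = 2, sd = 10, sig.level = 0.05,
               type = "one.sample", alternative = "two.sided")$power
})

results <- data.frame(n     = sample_sizes,
                      alpha = 0.05,
                      power = power_at_n,
                      beta  = 1 - power_at_n)
print(results)
```

In this sketch \(\beta\) shrinks as \(n\) increases even though \(\alpha\) stays the same, which is the sense in which more data can reduce the Type II error rate without raising the Type I error rate.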

The logic of hypothesis tests is similar to the logic behind inter-universe travel in the movie Everything Everywhere All at Once…

Week 12 - Inference from two samples (grouped data)

Example: Confidence interval for a difference in means (from Week 12)

On average, how much more money do consumers spend at Target compared to Walmart?

Suppose researchers collected a systematic sample from \(85\) Walmart customers and \(80\) Target customers by asking them for their purchase amount as they left the stores. The data they collected is summarized in the table below. Suppose a computer already calculated the degrees of freedom to be \(162.75\).

              Walmart    Target
\(\bar{x}\)   \(\$45\)   \(\$53\)
\(s\)         \(\$21\)   \(\$19\)
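
The problem does not say how the computer obtained the degrees of freedom. One common choice for two samples with unequal variances is the Welch-Satterthwaite approximation; the sketch below assumes that formula and applies it to the summary statistics above. The variable names are illustrative.

```r
# Summary statistics from the problem statement.
n_target  <- 80; s_target  <- 19
n_walmart <- 85; s_walmart <- 21

# Estimated variance of each sample mean, s^2 / n.
v_target  <- s_target^2  / n_target
v_walmart <- s_walmart^2 / n_walmart

# Welch-Satterthwaite approximation (an assumption: the source only
# reports the resulting value, not the formula used).
df_welch <- (v_target + v_walmart)^2 /
  (v_target^2 / (n_target - 1) + v_walmart^2 / (n_walmart - 1))

print(round(df_welch, 2))
```

Under this assumption the result agrees with the stated \(162.75\) to two decimal places.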

Step 1) Identify and define the population parameter and choose your confidence level.

Step 2) Calculate the sample estimate for the population parameter.

Step 3) Assess the required assumptions and conditions.

Step 4) Find the critical value corresponding to your confidence level.

Step 5) Calculate the standard error of your sample estimate.

Step 6) Calculate the lower and upper bounds of your confidence interval.


Example: Confidence interval for a mean difference of paired data (from Week 13)

On average, how large is the difference in car insurance prices for customers of an online insurance company versus customers of a local insurance company?

Find a \(95\%\) confidence interval for the mean difference in insurance prices based on the data given below.

mean(insurance_diff$PriceDiff)
## [1] 45.9
sd(insurance_diff$PriceDiff)
## [1] 175.6628

Looking ahead


Partial Solutions

Example: Confidence interval for a difference in means (from Week 12)

Step 1) The parameter is \(\mu_1 - \mu_2\), the mean amount spent at Target (\(\mu_1\)) minus the mean amount spent at Walmart (\(\mu_2\)). We’ll use a 95% confidence level.

Step 2) \(\bar{x}_1 - \bar{x}_2 = \$53 - \$45 = \$8\)

Step 3) Assess the required assumptions and conditions - done in class.

Step 4) We need the critical value \(t^*\) for a 95% confidence level from a Student’s t distribution with \(162.75\) degrees of freedom. We can find it exactly using R, and it should be close to the approximate critical value read off the t-table. The call below returns the lower-tail 0.025 quantile, which is negative; because the t distribution is symmetric, \(t^*\) is its absolute value, so \(t^* \approx 1.975\).

qt(0.025, df = 162.75, lower.tail=TRUE)
## [1] -1.974647

Step 5) \(SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{19^2}{80} + \frac{21^2}{85}} \approx 3.115\)

Step 6) \(8 \pm (1.975)(3.115) = [\$1.848, \$14.152]\) with interpretation given in class.
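
As a check on the arithmetic, Steps 2 through 6 above can be reproduced in R from the summary statistics alone. This is a sketch with illustrative variable names; the degrees of freedom are taken as given in the problem rather than recomputed.

```r
# Summary statistics from the problem (group 1 = Target, group 2 = Walmart).
xbar1 <- 53; s1 <- 19; n1 <- 80
xbar2 <- 45; s2 <- 21; n2 <- 85
df    <- 162.75   # degrees of freedom as stated in the problem
conf  <- 0.95

estimate <- xbar1 - xbar2                       # Step 2: point estimate
t_star   <- qt(1 - (1 - conf) / 2, df = df)     # Step 4: upper-tail quantile (positive)
se       <- sqrt(s1^2 / n1 + s2^2 / n2)         # Step 5: standard error
ci       <- estimate + c(-1, 1) * t_star * se   # Step 6: lower and upper bounds

print(round(c(estimate = estimate, t_star = t_star, se = se,
              lower = ci[1], upper = ci[2]), 3))
```

Because R carries the unrounded critical value and standard error through the calculation, the bounds can differ in the third decimal place from the hand calculation, which rounds both to three decimals first.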


Example: Confidence interval for a mean difference of paired data (from Week 13)

Step 1) Identify and define the population parameter and choose your confidence level.

Step 2) Calculate the sample estimate for the population parameter.

Step 3) Assess the required assumptions and conditions.

Step 4) Find the critical value corresponding to your confidence level.

Step 5) Calculate the standard error of your sample estimate.

Step 6) Calculate the lower and upper bounds of your confidence interval.
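
Steps 4 through 6 for paired data can be sketched as a small R helper. `paired_ci` is an illustrative name, not a course function. The number of pairs is not stated in the problem text shown here, so `n_pairs` below is a placeholder assumption; substitute the actual number of rows in `insurance_diff` before reading anything into the result.

```r
# Summary statistics from the R output in the problem.
d_bar <- 45.9       # mean(insurance_diff$PriceDiff)
s_d   <- 175.6628   # sd(insurance_diff$PriceDiff)

# Illustrative helper: t-interval for the mean of paired differences.
# n is the number of pairs (NOT given in the problem text).
paired_ci <- function(d_bar, s_d, n, conf = 0.95) {
  t_star <- qt(1 - (1 - conf) / 2, df = n - 1)   # Step 4: critical value
  se     <- s_d / sqrt(n)                        # Step 5: standard error
  c(lower = d_bar - t_star * se,                 # Step 6: bounds
    upper = d_bar + t_star * se)
}

# Placeholder only: 10 is an assumed value, not the real sample size.
n_pairs <- 10
print(paired_ci(d_bar, s_d, n_pairs))
```

With the full data frame loaded, `t.test(insurance_diff$PriceDiff, conf.level = 0.95)$conf.int` should give the same kind of interval directly, without needing to supply the sample size by hand.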